The basic research objective of this Young Investigator proposal has been to determine the best architectural
modalities  to  insert  optics  technology  into  future  system-on-chip  (SoC)  platforms  of  interest  to  the  Air 
Force,  in  order  to  meet  projected  on-chip  data  transfer  rates  that  far  exceed  what  is  possible  today.  The 
research  has  aimed  to  complement  advances  in  optoelectronic  devices  and  supportive  materials  with 
innovative  architectural  designs  that  integrate  these  components  according  to  system-wide  application 
needs. The two main goals of this project are: (i) Design new optoelectronic network topologies and
protocols  to  effectively  combine  multiple  stacked  layers  of  heterogeneous  optical  interconnects  with 
electrical  interconnects  for  application-specific  performance  enhancements;  and  (ii)  Devise  new  techniques 
for  memory-access  enhancements  with  optically  connected  DRAM  modules  and  mechanisms  for  energy- 
efficient  runtime  reconfiguration  of  optoelectronic  components  in  emerging  SoC  platforms.  For  validation, 
we  aimed  to  realize  a  physical-level  implementation  of  the  optoelectronic  network  to  enhance  the  operation 
of  a  SoC  application  for  information  and  image  processing  and  provide  insights  into  the  behavior  of 
optoelectronic  interfaces  and  optical  devices  at  the  system  level. 


Summary  of  Research  Accomplishments 

This project accomplished its two main goals and also laid the foundation for solving new problems
and challenges in related areas. The only shortcoming was the physical-implementation-based
validation. Despite our best efforts working with the OpSIS foundry, we were unable to fabricate
complex photonic architectures while OpSIS was in operation (until February 2015), owing to the
immaturity of the silicon photonics fabrication process. However, we have extensively validated
all  of  our  designed  integrated  silicon  photonics  architectures  and  circuits  using  analytical  modeling, 
physical-level  silicon  photonics  design  tools  from  IPKISS  and  Lumerical,  and  detailed  SoC-level 
simulations  with  parameters  derived  from  real  fabricated  chips. 

The  project  has  supported  4  PhD  students,  3  MS  students,  and  7  undergraduate  students,  either  partially  or 
entirely,  to  accomplish  the  goals  of  the  original  proposal.  The  project  has  resulted  in  the  following  products, 
publications,  and  recognitions: 

•  5 peer-reviewed IEEE/ACM journal publications

•  12 peer-reviewed IEEE/ACM conference publications

•  1  keynote  talk/paper  (at  the  IEEE  LPDC  2015  workshop) 

•  1  invited  special  session  (at  the  IEEE  VLSID  2016  conference) 

•  1  Best  Paper  Award  (at  ACM  GLSVLSI  2015  conference) 

•  1  Best  Paper  Finalist  (at  IEEE  ISQED  2016  conference) 

•  1  guest  edited  special  journal  issue  on  silicon  photonics  for  multicore  computing  (IEEE  D&T  2015) 

•  1  invited  book  chapter  (in  book  “Optical  Interconnects  for  Computer  Systems”,  2016) 

•  4  invited  seminar/workshop  talks 
In the area of innovative optoelectronic network topologies and protocols at the architecture level and CAD
exploration tools for future SoC platforms, we have made the following contributions:

•  In  [Bahirat  and  Pasricha,  2014a],  we  proposed  and  explored  a  novel  hybrid  ring-mesh 
electrophotonic  NoC  fabric  (METEOR)  for  emerging  chip  multiprocessors  (CMPs)  based  on 
advances  in  integrating  nanoscale  silicon  photonics  with  commercial  CMOS  manufacturing 
technology.  Our  proposed  fabric  consists  of  a  photonic  ring  waveguide  that  acts  as  a  global 
communication  channel  and  complements  a  more  traditional  2D  electrical  NoC  fabric.  This  hybrid 
communication  architecture  utilizes  electrical  and  photonic  paths  simultaneously  to  improve  the 
performance-per-watt  characteristics  of  a  CMP.  We  explore  different  architectural  configurations 
of  our  hybrid  photonic  NoC  fabric  by  considering:  (i)  varying  levels  of  electrical  to  photonic 
communication  connectivity,  (ii)  multiple  degrees  of  communication  serialization,  and  (iii) 
different  levels  of  photonic  wavelength  division  multiplexing.  These  configurations  enable 
interesting  trade-offs  between  performance  and  power  consumption  in  the  proposed  architecture. 
Our  experimental  results  indicate  significant  potential  for  METEOR  as  it  can  provide  about  5x 
reduction  in  power  consumption  and  improvements  in  throughput  and  access  latencies,  compared 
to  traditional  electrical  2D  mesh  and  torus  NoCs.  Our  proposed  METEOR  fabric  also  demonstrates 
lower  photonic-layer  area  cost,  power  consumption,  and  energy-delay  product,  while  maintaining 
competitive  communication  latency  and  throughput  compared  to  previously  proposed  hybrid 
photonic  NoC  fabrics,  such  as  the  hybrid  photonic  torus,  the  all-optical  Corona  crossbar,  and  the 
hybrid  hierarchical  Firefly  crossbar. 

•  The  key  challenges  for  waveguide  photonics  include:  (i)  high  complexity  and  overhead  of  thermally 
tuning  microring  resonators  to  ensure  proper  coupling  of  wavelengths,  (ii)  high  power  footprint 
due  to  significant  waveguide  crossing,  propagation,  and  bending  losses,  (iii)  need  for  complex 
tapered structures and optimized grating couplers with high coupling efficiency, and (iv) 0.5-3 μm
inter-waveguide  spacing  requirements  to  avoid  crosstalk  that  can  lead  to  lower  bandwidth  density 
than  in  optimized  electrical  wires.  To  overcome  these  challenges  with  waveguide  photonics,  free- 
space  nanophotonics  based  on  GaAs/AlAs  dense  Multiple  Quantum  Well  (MQW)  devices  have 
recently  been  proposed  as  an  alternative.  Such  free-space  configurations  can  be  integrated  with 
standard  CMOS  fabrication  processes  and  are  better  suited  for  high-density  optical  interconnects 
due  to  their  small  active  area  and  improved  misalignment  tolerance.  MQW  devices  are  projected 
to consume less than 1 pJ/bit energy and can be configured either as absorption modulators or
photo-detectors (PDs). On-chip optical interconnects utilizing MQWs can operate at 40 Gbps
bandwidth to instantiate single-hop or multi-hop transfers through free-space optical links. MQW
modulators  provide  significant  potential  to  get  around  the  thermal  tuning  challenges  of  silicon 
microring  resonators  and  can  be  fabricated  in  various  angles  to  achieve  out-of-plane  beam  steering 
directions.  In  [Bahirat  and  Pasricha,  2014b],  we  proposed  a  novel  system-level  framework 
(HELIX)  to  synthesize  application-specific  hybrid  (electrical  and  free-space  photonics)  NoC 
fabrics. HELIX integrates graph-based algorithms, linear programming, and custom heuristics to
enable rapid design space exploration and application-specific customization of hybrid electro-photonic
NoC fabrics for many-core chip architectures. Based on our experimental studies, we
demonstrate  that  the  proposed  techniques  in  the  HELIX  framework  produce  a  superior  NoC 
architecture  that  satisfies  all  performance  requirements  for  MiBench  multi-application  workloads 
and  PARSEC  multi-threaded  workloads,  while  achieving  an  average  of  3.06x  reduction  in  power 
dissipation  across  SoC  platforms  of  varying  complexity,  compared  to  previously  proposed 
application-specific  electrical-only  NoC  synthesis  frameworks. 

•  In [Bahirat and Pasricha, 2014c], we presented the 3D-HELIX framework to synthesize
heterogeneous application-specific hybrid (free-space) nanophotonic-electric 3D NoCs for
emerging 3D chip multiprocessors. Based on our experimental studies, we demonstrated that the
proposed techniques in the 3D-HELIX framework produce a superior hybrid nanophotonic-electric
3D NoC architecture that satisfies all performance requirements for multi-application workloads,
while achieving average power reductions ranging from 2.5x to 6x for multi-layer small, medium, and
large-sized 3D-NoC-based heterogeneous 3D CMP architectures, compared to synthesized
application-specific electrical 3D NoCs.

•  In  [Chittamuru,  Desai,  Pasricha,  2015]  we  proposed  the  UltraNoC  photonic  NoC  architecture  that 
features improved channel sharing among cores by using an aggressive concurrent token-stream-based
arbitration strategy. UltraNoC utilizes multiple-write-multiple-read (MWMR) photonic
waveguides in a crossbar topology, and supports dynamic performance adaptation to aggressively
utilize network bandwidth and meet diverse application demands. UltraNoC can also
dynamically transfer bandwidth between clusters of cores and re-prioritize multiple co-running
applications to further improve channel utilization and adapt to time-varying application
performance  goals.  Our  architecture  improves  throughput  by  up  to  9.8x,  latency  by  up  to  55%  and 
EDP  by  up  to  90%  over  traditional  electrical  and  state-of-the-art  photonic  NoC  architectures  with 
the  best-known  arbitration  mechanisms.  UltraNoC  also  scales  well  with  increasing  core  counts  on 
a  chip,  and  reduces  crosstalk  in  photonic  channels  to  enhance  communication  reliability.  This 
paper  received  the  Best  Paper  Award  at  the  ACM  GLSVLSI 2015  conference. 
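
The MQW link figures quoted in the HELIX item above (less than 1 pJ/bit at 40 Gbps) imply an upper bound on the dynamic power of a single free-space link, since average power is energy per bit multiplied by bit rate. The short Python sketch below works through that arithmetic. The function name and the utilization parameter are illustrative assumptions and are not taken from the cited papers.

```python
def link_power_watts(energy_per_bit_joules: float,
                     bit_rate_bps: float,
                     utilization: float = 1.0) -> float:
    """Average link power = energy per bit x bits per second x utilization."""
    if not 0.0 <= utilization <= 1.0:
        raise ValueError("utilization must be between 0 and 1")
    return energy_per_bit_joules * bit_rate_bps * utilization


# Projected MQW bound from the text: under 1 pJ/bit at 40 Gbps.
# utilization=1.0 (link always busy) is an assumption made for the bound.
bound_w = link_power_watts(1e-12, 40e9)
print(f"Upper bound per fully utilized link: {bound_w * 1e3:.1f} mW")
```

Under these assumptions a fully utilized link stays below roughly 40 mW; a link that is idle part of the time scales down linearly with the utilization argument.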

In the area of memory-access enhancements with optically connected DRAM modules, we have made the
following contributions:

•  In [Thakkar and Pasricha, 2014a] we introduced 3D-Wiz, a high-bandwidth, low-latency,
optically interfaced 3D DRAM architecture with fine-grained data organization and activation.
3D-Wiz integrates sub-bank level 3D partitioning of the data array to enable fine-grained activation
and  greater  memory  parallelism.  A  novel  method  of  routing  the  internal  memory  bus  using  TSVs 
and  fan-out  buffers  enables  3D-Wiz  to  use  smaller  dimension  subarrays  without  significant  area 
overhead.  This  in  turn  reduces  the  random  access  latency  and  activation-precharge  energy.  3D- 
Wiz  demonstrates  access  latency  of  19.5ns  and  row  cycle  time  of  25ns.  It  yields  per  access 
activation  energy  and  precharge  energy  of  0.78nJ  and  0.62nJ  respectively  with  42.5%  area 
efficiency.  3D-Wiz  yields  the  best  latency  and  energy  consumption  values  per  access  among  other 
well-known  3D  DRAM  architectures.  Experimental  results  with  PARSEC  benchmarks  indicate 
that  3D-Wiz  achieves  38.8%  improvement  in  performance,  81.1%  reduction  in  power 
consumption,  and  77.1%  reduction  in  energy-delay  product  (EDP)  on  average  over  3D  DRAM 
architectures  from  prior  work. 

•  In  [Thakkar  and  Pasricha,  2015b]  we  introduced  3D-ProWiz  (an  enhanced  version  of  3D-Wiz), 
which is a high-bandwidth, energy-efficient, optically interfaced 3D DRAM architecture with
fine-grained data organization and activation. 3D-ProWiz integrates sub-bank level 3D partitioning of
the  data  array  to  enable  fine-grained  activation  and  greater  memory  parallelism.  A  novel  method 
of  routing  the  internal  memory  bus  to  individual  subarrays  using  TSVs  and  fanout  buffers  enables 
3D-ProWiz  to  use  smaller  dimension  subarrays  without  significant  area  overhead.  The  use  of  TSVs 
at  subarray-level  granularity  eliminates  the  need  to  use  slow  and  power  hungry  global  lines,  which 
in  turn  reduces  the  random  access  latency  and  activation-precharge  energy.  3D-ProWiz  yields  the 
best  latency  and  energy  consumption  values  per  access  among  other  well-known  3D  DRAM 
architectures.  Experimental  results  with  PARSEC  benchmarks  indicate  that  3D-ProWiz  achieves 
41.9%  reduction  in  average  latency,  52%  reduction  in  average  power,  and  80.6%  reduction  in 
energy-delay  product  (EDP)  on  average  over  DRAM  architectures  from  prior  work. 

•  In  [Thakkar  and  Pasricha,  2015c]  we  presented  a  novel  Wide-I/O  DRAM  architecture  called  3D- 
WiRED,  with  an  enhanced  DRAM  core  to  enable  low  latency  and  energy-efficient  optically 
interfaced  memory  access.  Through  detailed  time-energy  analysis  of  a  Wide-I/O  DRAM 
prototype,  we  have  identified  the  need  to  reduce  the  capacitance  of  bitlines,  memory  bus,  and 
global  data  path  to  reduce  random  access  latency,  read/write  energy,  and  activation-precharge 
energy.  We  reorganize  DRAM  banks  and  utilize  a  TSV-based  internal  memory  bus  to  achieve 
reduced capacitance of the data access path with increased area-efficiency. We presented detailed
breakdowns  of  timing  and  energy  for  the  prototype  Wide-I/O  DRAM,  through  which  we  identify 
the  key  components  of  DRAM  organization  that  most  significantly  affect  overall  latency  and 
energy  of  the  DRAM  subsystem.  We  modeled  and  studied  two  variants  of  the  state-of-the-art 
Wide-I/O  and  3D-SWIFT  DRAMs,  to  derive  an  optimum  combination  of  critical  enhancements  to 
make  in  Wide-I/O  bank  organization  that  would  achieve  combined  benefits  in  performance, 
energy-efficiency,  cost  and  area.  We  employed  large-aspect-ratio  subarrays  to  reduce  bitline 
capacitance,  with  high  area  efficiency.  Reduced  bitline  capacitance  reduces  row  cycle  time  and 
activation-precharge  energy  which  relaxes  the  power  constraint  and  increases  bank-level 
parallelism. We reorganized 3D-WiRED banks by using a TSV-based internal memory bus to
eliminate  global  wordlines  and  datalines,  which  reduces  access  time  and  read/write  energy. 
Experimental results indicate that our proposed 3D-WiRED DRAM architecture yields on average
31.2%, 32.9%, and 52.8% improvements in energy-per-bit, average latency, and energy-delay
product (EDP), respectively, over state-of-the-art Wide-I/O and 3D-SWIFT DRAM architectures.

•  In  [Thakkar  and  Pasricha,  2015a],  we  presented  3D-SGDRAM,  a  new  3D-stacked  graphics  DRAM 
architecture  with  optical  interfacing  for  GPU-centric  processing  systems.  3D-SGDRAM  employs 
a  new  bitline  interface  and  a  bank  organization  based  on  detailed  parameter  characterization  and 
optimization  to  achieve  simultaneous  improvements  in  performance,  throughput,  power,  and  area 
of  the  DRAM  core.  We  modified  the  bitline  interface  of  the  DRAM  core  to  enable  access  to  only 
a  selective  group  of  bitlines  in  all  active  subarrays  during  a  memory  transaction,  which  helps 
optimize  page  size  and  related  architectural  parameters.  We  characterized  the  interdependence 
between various architectural parameters of the 3D-SGDRAM bank organization and optimized
these  parameters,  to  achieve  benefits  in  performance,  power,  throughput  and  area.  Experimental 
results  with  CUDA  benchmarks  indicate  that  3D-SGDRAM  yields  57.5%,  77.7%,  and  45.2% 
improvements  in  power,  latency,  and  energy-delay  product  (EDP)  on  average  over  state-of-the-art 
GDDR5  and  GDDR5M  solutions. 

•  In  [Thakkar  and  Pasricha,  2016]  we  presented  a  novel,  energy-efficient  DRAM  refresh  technique 
called  massed  refresh  that  simultaneously  leverages  bank-level  and  subarray-level  concurrency  to 
reduce  the  overhead  of  distributed  refresh  operations  in  the  Hybrid  Memory  Cube  (HMC)  and 
other optically connected DRAM architectures. In massed refresh, a bundle of DRAM rows in a
refresh  operation  is  composed  of  two  subgroups  mapped  to  two  different  banks,  with  the  rows  of 
each  subgroup  mapped  to  different  subarrays  within  the  corresponding  bank.  Both  subgroups  of 
DRAM  rows  are  refreshed  concurrently  during  a  refresh  command,  which  greatly  reduces  the 
refresh  cycle  time  and  improves  bandwidth  and  energy  efficiency  of  the  HMC.  Our  experimental 
analysis  shows  that  the  proposed  massed  refresh  technique  achieves  up  to  6.3%  and  5.8% 
improvements  in  throughput  and  energy-delay  product  on  average  over  JEDEC  standardized 
distributed  per-bank  refresh  and  state-of-the-art  scattered  refresh  techniques. 
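
The row grouping described for massed refresh above (a bundle split into two subgroups on two different banks, with each subgroup's rows placed in different subarrays) can be sketched as a small address-mapping function. This is a minimal illustration only: the specific mapping, the names `RowTarget` and `massed_refresh_bundle`, and the parameter values are hypothetical and are not the mapping used in [Thakkar and Pasricha, 2016].

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RowTarget:
    bank: int
    subarray: int
    row: int  # row index inside the subarray


def massed_refresh_bundle(bundle_index, rows_per_bundle, num_banks, subarrays_per_bank):
    """Rows refreshed together by one refresh command (hypothetical mapping)."""
    if rows_per_bundle % 2 != 0:
        raise ValueError("bundle must split into two equal subgroups")
    if num_banks < 2 or num_banks % 2 != 0:
        raise ValueError("need an even number of banks (at least two)")
    half = rows_per_bundle // 2
    if half < 1 or subarrays_per_bank % half != 0:
        raise ValueError("subgroup size must evenly divide the subarray count")

    bank_pairs = num_banks // 2
    first_bank = 2 * (bundle_index % bank_pairs)       # two adjacent, distinct banks
    step = bundle_index // bank_pairs                  # advances once every bank pair is visited
    groups = subarrays_per_bank // half
    base_subarray = (step % groups) * half
    row = step // groups

    targets = []
    for bank in (first_bank, first_bank + 1):          # subgroup 0 and subgroup 1
        for offset in range(half):                     # one distinct subarray per row
            targets.append(RowTarget(bank, base_subarray + offset, row))
    return targets


for target in massed_refresh_bundle(bundle_index=9, rows_per_bundle=8,
                                    num_banks=16, subarrays_per_bank=8):
    print(target)
```

Because the two subgroups sit in different banks and no two rows of a subgroup share a subarray, all rows of the bundle can in principle be activated concurrently, which is the property the text credits for the shorter refresh cycle time.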


In  the  area  of  device-level  characterization  and  circuit-level  enhancements  for  optoelectronic  components 
to  improve  reliability,  energy-efficiency,  and  performance,  we  have  made  the  following  contributions: 

•  Microring-resonators  (MRs),  which  are  the  basic  building  blocks  of  PNoCs,  are  highly  susceptible 
to crosstalk that can notably degrade optical signal-to-noise ratio (SNR), reducing reliability in
PNoCs.  We  observed  that  when  transmitting  data  in  PNoCs,  crosstalk  noise  in  MRs  depends  on  the 
characteristics  of  data  values  propagating  in  the  photonic  waveguide.  Therefore  in  [Chittamuru  and 
Pasricha,  2015a]  we  proposed  novel  techniques  to  intelligently  reduce  undesirable  data  value 
occurrences  in  a  photonic  waveguide.  These  techniques  are  easily  implementable  in  any  existing 
DWDM-based  photonic  crossbar  without  requiring  major  modifications  to  the  architectures,  unlike 
previously  proposed  crosstalk  mitigation  techniques  that  are  targeted  to  reduce  crosstalk  in  specific 
architectures  by  requiring  modifications  to  their  router  designs.  We  designed  a  crosstalk  mitigation 
technique with 5-bit encoding (PCTM5B) to improve worst-case SNR for DWDM-based photonic
crossbar PNoCs. We also introduced another crosstalk-mitigation scheme with 6-bit encoding
(PCTM6B) that improves SNR more aggressively but with relatively higher EDP overhead. Our
evaluation results indicate that the encoding schemes improve worst-case SNR in Corona and
Firefly PNoCs by up to 18%.

•  In [Chittamuru and Pasricha, 2015b] we observed that, for a fixed free spectral range (FSR), increasing
the number of DWDM wavelengths in a waveguide reduces the spacing between adjacent
wavelengths, which in turn increases crosstalk noise. The transmission spectra of cascaded
MRs show that the overlapping region between adjacent wavelengths shrinks as
the wavelength spacing grows, which reduces crosstalk noise. The SNR in DWDM-based
photonic crossbars therefore depends directly on the DWDM degree of their waveguides. We proposed
a novel wavelength spacing (WSP) technique to increase the spacing between adjacent wavelengths
in a DWDM waveguide for PNoCs. Experimental results on two photonic crossbar architectures
(Corona and Firefly) indicate that our approach improves worst-case signal-to-noise ratio (SNR)
by up to 51.7%.

•  Photonic  network-on-chip  (PNoC)  architectures  typically  employ  dense  wavelength  division 
multiplexing  (DWDM)  for  high  bandwidth  transfers.  Unfortunately,  DWDM  increases  crosstalk 
noise  and  decreases  optical  signal  to  noise  ratio  (SNR)  in  microring  resonators  (MRs)  threatening 
the  reliability  of  data  communication.  Additionally,  process  variations  induce  variations  in  the 
width  and  thickness  of  MRs  causing  shifts  in  resonance  wavelengths  of  MRs,  which  further  reduces 
signal  integrity,  leading  to  communication  errors  and  bandwidth  loss.  In  [Chittamuru,  Thakkar, 
Pasricha,  2016b],  we  proposed  a  novel  encoding  mechanism  that  intelligently  adapts  to  on-chip 
process  variations,  and  improves  worst-case  SNR  by  reducing  crosstalk  noise  in  MRs  used  within 
DWDM-based  PNoCs.  Experimental  results  on  the  Corona  PNoC  architecture  indicate  that  our 
approach improves worst-case SNR by up to 44.13%. This paper was a Best Paper Finalist at
the  IEEE  ISQED  2016  conference. 

•  In  [Chittamuru,  Thakkar,  Pasricha,  2016a],  we  presented  a  novel  crosstalk  mitigation  framework 
called PICO to enable reliable communication in emerging PNoC-based multicore systems. PICO
mitigates the effects of intermodulation (IM) crosstalk by controlling signal loss of wavelengths in the waveguide and
reduces  trimming-induced  crosstalk  by  intelligently  reducing  undesirable  data  value  occurrences 
in  a  photonic  waveguide  based  on  the  PV  profile  of  MRs.  Our  framework  has  low  overhead  and  is 
easily  implementable  in  any  existing  DWDM-based  PNoC  without  major  modifications  to  the 
architecture.  To  the  best  of  our  knowledge,  this  is  the  first  work  that  attempts  to  improve  SNR  in 
PNoCs considering both IM effects and PV in their MRs. We presented device-level analytical
models  to  capture  the  deleterious  effects  of  localized  trimming  in  MRs.  Moreover,  we  extended 
this  model  for  system-level  heterodyne  crosstalk  analysis.  We  proposed  a  scheme  for  IM  passband 
truncation-aware  heterodyne  crosstalk  mitigation  (IMCM)  to  improve  worst-case  SNR  of  MRs  by 
controlling  non-resonant  signal  power.  We  proposed  a  scheme  for  process  variation  (PV)-aware 
heterodyne  crosstalk  mitigation  (PVCM)  to  improve  worst-case  SNR  of  detector  MRs  by  encoding 
data  to  avoid  undesirable  data  occurrences.  Experimental  results  indicate  that  our  approach  can 
improve  the  worst-case  SNR  by  up  to  4.4x  and  significantly  enhance  the  reliability  of  DWDM- 
based  PNoC  architectures. 
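
The wavelength-spacing observation above (for a fixed FSR, more DWDM wavelengths means tighter spacing and more crosstalk) can be illustrated with a first-order calculation. The sketch below assumes uniformly spaced carriers and a Lorentzian MR passband whose full width at half maximum equals the center wavelength divided by the quality factor Q; it considers only the nearest neighbor. The FSR, center wavelength, and Q values are hypothetical, and this simplified model is not the analytical model from the cited papers.

```python
def channel_spacing_nm(fsr_nm: float, num_wavelengths: int) -> float:
    """Uniform spacing when num_wavelengths carriers share one free spectral range."""
    if num_wavelengths < 1:
        raise ValueError("need at least one wavelength")
    return fsr_nm / num_wavelengths


def lorentzian_leakage(spacing_nm: float, center_nm: float, q_factor: float) -> float:
    """Fraction of a neighboring carrier's power picked up by a ring detuned by spacing_nm.

    Assumes a Lorentzian passband with FWHM = center wavelength / Q.
    """
    fwhm_nm = center_nm / q_factor
    return 1.0 / (1.0 + (2.0 * spacing_nm / fwhm_nm) ** 2)


# Hypothetical device values, chosen only for illustration.
FSR_NM, CENTER_NM, Q = 20.0, 1550.0, 9000.0

for n in (16, 32, 64):
    spacing = channel_spacing_nm(FSR_NM, n)
    leak = lorentzian_leakage(spacing, CENTER_NM, Q)
    print(f"{n:2d} wavelengths: spacing {spacing:.4f} nm, nearest-neighbor leakage {leak:.4f}")
```

In this simplified model the leakage term grows monotonically as the spacing shrinks, which is the trend that motivates trading DWDM degree for spacing in the WSP technique.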


In  the  area  of  energy-efficient  runtime  reconfiguration  of  optoelectronic  components,  we  have  made  the 
following  contributions: 

•  The operation of photonic NoCs (PNoCs) is very sensitive to the temperature variations that frequently
occur on a chip. These variations can create significant reliability issues for PNoCs. For example,
a microring resonator (MR) may resonate at another wavelength instead of its designated
wavelength due to thermal variations, which can lead to bandwidth wastage and data corruption in
PNoCs. In [Chittamuru and Pasricha, 2016c] we proposed a novel run-time framework called
SPECTRA  to  overcome  temperature-induced  reliability  issues  in  PNoCs.  The  framework  consists 
of  (i)  a  device-level  reactive  MR  assignment  mechanism  that  dynamically  assigns  a  group  of  MRs 
to  reliably  modulate/receive  data  in  a  waveguide  based  on  the  chip  thermal  profile;  and  (ii)  a 
system-level  proactive  thread  migration  technique  to  avoid  on-chip  thermal  threshold  violations 
and  reduce  MR  tuning/trimming  power  by  dynamically  migrating  threads  between  cores. 
Experimental  results  indicate  that  SPECTRA  can  satisfy  on-chip  thermal  thresholds  and  maintain 
high NoC bandwidth while reducing total power by up to 61%, and thermal tuning/trimming power
by  up  to  71%  over  state-of-the-art  thermal  management  solutions. 
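
The thermally induced resonance shift described above, and the idea of reassigning MRs according to the thermal profile, can be sketched with a toy model. The code below is not the SPECTRA algorithm. It assumes a linear resonance drift of 0.1 nm/K (a round illustrative figure), a hypothetical ring bank that includes spare rings, and a greedy nearest-resonance assignment. All names and values are placeholders.

```python
def shifted_resonance_nm(designed_nm, delta_t_kelvin, drift_nm_per_kelvin=0.1):
    """Resonance after a temperature rise, using an assumed linear drift model."""
    return designed_nm + drift_nm_per_kelvin * delta_t_kelvin


def assign_rings(ring_designed_nm, channel_nm, delta_t_kelvin, drift_nm_per_kelvin=0.1):
    """Map each channel index to the unused ring whose shifted resonance is closest."""
    shifted = [shifted_resonance_nm(r, delta_t_kelvin, drift_nm_per_kelvin)
               for r in ring_designed_nm]
    free = set(range(len(shifted)))
    assignment = {}
    for ch_index, wavelength in enumerate(channel_nm):
        if not free:
            raise ValueError("fewer rings than channels")
        best = min(free, key=lambda i: abs(shifted[i] - wavelength))
        assignment[ch_index] = best
        free.remove(best)  # each ring serves at most one channel
    return assignment


rings = [1549.2, 1549.6, 1550.0, 1550.4, 1550.8]  # hypothetical ring bank with spare rings
channels = [1550.0, 1550.4, 1550.8]               # hypothetical carrier wavelengths

print(assign_rings(rings, channels, delta_t_kelvin=0.0))
print(assign_rings(rings, channels, delta_t_kelvin=8.0))
```

With no temperature rise the channels map to the rings designed for them; after the assumed 8 K rise the whole bank has drifted by two channel spacings, so the sketch selects the rings two positions lower instead of spending tuning power to pull the original rings back.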


Future  Directions 

We  envision  the  following  directions  for  future  research  in  this  area  that  can  solve  some  of  the  new 
challenges  we  have  uncovered  as  part  of  this  project: 

•  CMOS-compatible  integrated  photonics  devices  are  very  susceptible  to  various  sources  of 
uncertainty,  such  as  process  variations,  thermal  variations,  voltage  fluctuations,  circuit/device 
aging,  and  soft-errors.  There  is  a  need  for  a  holistic  framework  to  overcome  uncertainty  and  ensure 
predictable  behavior  of  photonic  devices  at  the  device,  circuit  and  architecture  levels. 

•  There  is  a  need  for  new  CAD  tools  that  can  allow  for  cross-layer  exploration  of  integrated  silicon 
photonic  architectures,  to  balance  the  multiple  objectives  of  performance,  energy-efficiency, 
reliability,  cost,  and  security  with  optoelectronic  networks. 

•  The  area  of  memory-NoC  co-optimization  shows  great  promise.  As  memory  components  determine 
a  majority  of  the  traffic  characteristics  in  NoCs,  co-design  and  co-optimization  of  the  NoC  and 
memory  becomes  essential  to  manage  reliability,  energy/power,  and  performance.  There  is  a  need 
for  new  optically  interfaced  memory  architectures  that  are  co-optimized  with  opto-electronic 
networks  at  the  chip  level. 